On the Usefulness of Opponent Modeling: the Kuhn Poker case study (Short Paper)

Authors

  • Alessandro Lazaric
  • Mario Quaresimale
  • Marcello Restelli
Abstract

The application of reinforcement learning algorithms to Partially Observable Stochastic Games (POSGs) is challenging because each agent lacks access to the full state information and, with concurrent learners, the environment has non-stationary dynamics. These problems could be partially overcome if the policies followed by the other agents were known, and for this reason many approaches try to estimate them through so-called opponent modeling techniques. Although much research has been devoted to the accuracy of the estimation of opponents' policies, little attention has been paid to understanding in which situations these model estimates are actually useful for improving the agent's performance. This paper presents a preliminary study of the impact of using opponent modeling techniques to learn the solution of a POSG. Our main purpose is to measure the gain in performance that can be obtained by exploiting information about the policies of other agents, and how this gain is affected by the accuracy of the estimated models. Our analysis focuses on a small two-agent POSG: Kuhn Poker, a simplified version of classical poker. Three cases are considered according to the agent's knowledge of the opponent's policy: no knowledge, perfect knowledge, and imperfect knowledge. The aim is to identify the maximum error that the model estimate can tolerate without leading to a performance lower than that reachable without using opponent-modeling information. Finally, we show how the results of this analysis can be used to improve the performance of a reinforcement-learning algorithm coupled with a simple opponent modeling technique.
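The comparison between perfect and imperfect knowledge can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the standard three-card Kuhn Poker rules (one-chip ante, one-chip bet), assumes the agent acts first, and describes the opponent by two hypothetical probability tables, `call` and `bet`. All function and variable names are illustrative.

```python
# Hypothetical sketch: Kuhn Poker with cards J=0, Q=1, K=2 and a 1-chip ante.
# The agent moves first; the second-moving opponent is described by two
# probability tables. Names and setup are assumptions, not the paper's code.

CARDS = (0, 1, 2)
PLANS = ("bet", "check-fold", "check-call")


def showdown(agent_card, opp_card, stake):
    """Agent's payoff when the higher card wins `stake` chips."""
    return stake if agent_card > opp_card else -stake


def plan_values(card, opp):
    """Expected payoff of each first-player plan against opponent `opp`.

    opp["call"][c]: probability the opponent calls a bet holding card c.
    opp["bet"][c]:  probability the opponent bets after a check holding c.
    """
    values = dict.fromkeys(PLANS, 0.0)
    others = [c for c in CARDS if c != card]
    for c in others:
        w = 1.0 / len(others)  # opponent's card is uniform over the rest
        call, bet = opp["call"][c], opp["bet"][c]
        # Agent bets: opponent calls (showdown for 2) or folds (agent wins 1).
        values["bet"] += w * (call * showdown(card, c, 2) + (1 - call) * 1)
        # Agent checks: opponent checks (showdown for 1) or bets.
        checked = (1 - bet) * showdown(card, c, 1)
        values["check-fold"] += w * (checked - bet)
        values["check-call"] += w * (checked + bet * showdown(card, c, 2))
    return values


def best_response(opp_model):
    """Pick, for each card, the plan that maximizes payoff under the model."""
    return {c: max(PLANS, key=plan_values(c, opp_model).get) for c in CARDS}


def evaluate(plans, true_opp):
    """Average payoff of the chosen plans against the true opponent."""
    return sum(plan_values(c, true_opp)[plans[c]] for c in CARDS) / len(CARDS)


# Illustrative opponents: one that always calls and never bets, and a
# badly wrong model that believes the opponent always folds.
true_opp = {"call": [1.0, 1.0, 1.0], "bet": [0.0, 0.0, 0.0]}
wrong_model = {"call": [0.0, 0.0, 0.0], "bet": [0.0, 0.0, 0.0]}

perfect = evaluate(best_response(true_opp), true_opp)
imperfect = evaluate(best_response(wrong_model), true_opp)
print(f"perfect knowledge: {perfect:.3f}, wrong model: {imperfect:.3f}")
```

Under these assumptions, the best response to the true policy can never score lower against that policy than the best response to a mistaken model, so the gap between the two values is one simple way to quantify the cost of model error.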

Similar resources

An Experimental Approach to Online Opponent Modeling in Texas Hold'em Poker

The game of Poker is an excellent test bed for studying opponent modeling methodologies applied to non-deterministic games with incomplete information. The most known Poker variant, Texas Hold'em Poker, combines simple rules with a huge amount of possible playing strategies. This paper is focused on developing algorithms for performing simple online opponent modeling in Texas Hold'em. The oppon...

Poker Opponent Modeling (Michel Salim and Paul Rohwer)

Utilizing resources and research from the University of Alberta Poker research group, we are investigating opponent modeling improvements. Currently, our simple poker bot plays online against instantiations of PokiBots, the poker machine created by the University of Alberta research group. After some decision rule building, our poker bot is competitive. Our next step is to build upon this resea...

Particle Filtering for Dynamic Agent Modelling in Simplified Poker

Agent modelling is a challenging problem in many modern artificial intelligence applications. The agent modelling task is especially difficult when handling stochastic choices, deliberately hidden information, dynamic agents, and the need for fast learning. State estimation techniques, such as Kalman filtering and particle filtering, have addressed many of these challenges, but have received li...

Active Sensing for Opponent Modeling in Poker

One approach to designing an intelligent agent capable of winning competitive games such as Texas hold’em poker is to use opponent modeling to learn about an opponent’s behavior, then exploit that knowledge to maximize long term winnings. However, opponent modeling can suffer from several problems, including slow convergence due to a lack of a priori knowledge, noisy or dynamic opponent behavio...

Publication date: 2008